A Linear Programming Approach to Nonstationary Infinite-Horizon Markov Decision Processes
Abstract
Nonstationary infinite-horizon Markov decision processes (MDPs) generalize the most well-studied class of sequential decision models in operations research, namely, that of stationary MDPs, by relaxing the restrictive assumption that problem data do not change over time. Linear programming (LP) has been very successful in obtaining structural insights and devising solution methods for stationary MDPs. However, an LP approach for nonstationary MDPs is currently missing. This is because the LP formulation of a nonstationary infinite-horizon MDP includes countably infinite variables and constraints, and research on such infinite-dimensional LPs has traditionally faced several hurdles. For instance, duality results may not hold; an extreme point may not be a basic feasible solution; and in the context of a Simplex algorithm, a pivot operation may require infinite data and computations, and a sequence of improving extreme points need not converge in value to optimal. In this paper, we tackle these challenges and establish (1) weak and strong duality, (2) complementary slackness, (3) a basic feasible solution characterization of extreme points, (4) a one-to-one correspondence between extreme points and deterministic Markovian policies, and (5) devise a Simplex algorithm for an infinite-dimensional LP formulation of nonstationary infinite-horizon MDPs. Pivots in this Simplex algorithm use finite data, perform finite computations, and generate a sequence of improving extreme points that converges in value to optimal. Moreover, this sequence of extreme points gets arbitrarily close to the set of optimal extreme points. We also prove that decisions prescribed by these extreme points are eventually exactly optimal in all states of the nonstationary infinite-horizon MDP in early periods.
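To make concrete why such an LP has countably many variables and constraints, the following is a minimal sketch of one standard occupation-measure formulation. It is an illustration under assumed notation and is not necessarily the exact formulation used in the paper: the finite state set $S$, finite action set $A$, periods $n = 1, 2, \ldots$, bounded costs $c_n(s,a)$, transition probabilities $p_n(s' \mid s, a)$, discount factor $\gamma \in (0,1)$, and initial state distribution $\beta$ are all symbols introduced here for illustration.

```latex
% Assumed notation (not from the abstract): S, A, c_n, p_n, gamma, beta.
% x_n(s,a) = probability of occupying state s and choosing action a in period n.
\begin{align*}
\min_{x}\quad
  & \sum_{n=1}^{\infty} \gamma^{\,n-1} \sum_{s \in S} \sum_{a \in A} c_n(s,a)\, x_n(s,a) \\
\text{s.t.}\quad
  & \sum_{a \in A} x_1(s,a) = \beta(s),
  && s \in S, \\
  & \sum_{a \in A} x_n(s,a)
    - \sum_{s' \in S} \sum_{a \in A} p_{n-1}(s \mid s', a)\, x_{n-1}(s', a) = 0,
  && s \in S,\ n \ge 2, \\
  & x_n(s,a) \ge 0,
  && s \in S,\ a \in A,\ n \ge 1.
\end{align*}
```

Because the costs and transition probabilities carry the period index $n$, there is one block of variables and flow-balance constraints for every period, so the program cannot be collapsed to the finite LP available in the stationary case. In this sketch, a deterministic Markovian policy corresponds to a feasible $x$ in which, for each period and state with positive probability mass, exactly one action carries that mass.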
Similar resources
A linear programming approach to constrained nonstationary infinite-horizon Markov decision processes
Constrained Markov decision processes (MDPs) are MDPs optimizing an objective function while satisfying additional constraints. We study a class of infinite-horizon constrained MDPs with nonstationary problem data, finite state space, and discounted cost criterion. This problem can equivalently be formulated as a countably infinite linear program (CILP), i.e., a linear program (LP) with a count...
Extreme point characterization of constrained nonstationary infinite-horizon Markov decision processes with finite state space
We study infinite-horizon nonstationary Markov decision processes with discounted cost criterion, finite state space, and side constraints. This problem can equivalently be formulated as a countably infinite linear program (CILP), a linear program with countably infinite number of variables and constraints. We provide a complete algebraic characterization of extreme points of the CILP formulati...
Linear programming formulation for non-stationary, finite-horizon Markov decision process models
Linear programming (LP) formulations are often employed to solve stationary, infinite-horizon Markov decision process (MDP) models. We present an LP approach to solving nonstationary, finite-horizon MDP models that can potentially overcome the computational challenges of standard MDP solution procedures. Specifically, we establish the existence of an LP formulation for risk-neutral MDP models wh...
A stochastic programming approach for planning horizons of infinite horizon capacity planning problems
Planning horizon is a key issue in production planning. Different from previous approaches based on Markov Decision Processes, we study the planning horizon of capacity planning problems within the framework of stochastic programming. We first consider an infinite horizon stochastic capacity planning model involving a single resource, linear cost structure, and discrete distributions for genera...
A new solving approach for fuzzy multi-objective programming problem in uncertainty conditions by using semi-infinite linear programing
In practice, there are many problems in which the decision parameters are fuzzy numbers, and some of these problems are formulated using either possibilistic programming or multi-objective programming methods. In this paper, we consider a multi-objective programming problem with fuzzy data in constraints and introduce a new approach for solving these problems based on a combination of the multi-objecti...
Journal title:
- Operations Research
Volume 61
Publication date: 2013